-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Add support for automatic naming of random variables #6364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
7005ec9
to
c6df184
Compare
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #6364 +/- ##
==========================================
- Coverage 94.72% 94.71% -0.02%
==========================================
Files 132 132
Lines 26740 26789 +49
==========================================
+ Hits 25330 25372 +42
- Misses 1410 1417 +7
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty wild 😜 This will require group approval for sure but could make it to PyMC5 if it goes through.
Does it handle assignment in list comprehension?
pymc/distributions/distribution.py
Outdated
value = frame.f_locals[var_name] | ||
else: | ||
value = None | ||
if type(value) is int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this for container[i] = pm.Normal(0, 1)
?
Missing a test if that's the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tested assignment in a list comprehension. Can you include a short example?
Lots of the complexity in _extract_target_of_assignment
is for supporting a wider variety of things you can assign to like container[i]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I was asking what this integer case was for.
For list comprehension I was thinking normals = [pm.Normal() for i in range(5)]
Not saying we need to support it, just asking if it would be possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The integer case is for assigning to tuples I believe. The list comprehension case does not work, but we could probably modify _extract_target_of_assignment
to detect that situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should hold off on supporting it. Generating unique names for comprehensions that are predictable to the user looks tricky to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The integer case is for assigning to tuples I believe.
Can you add a test?
c6df184
to
4b592b1
Compare
@pymc-devs/dev-core Weight in :) |
Amazing! |
I remember that I didn't like this pattern before, because it messes up the typing of the arguments =/ I know that the typing of distribution arguments isn't great in the first place, but we're also talking about refactoring all these to functions. How would the signature of Can this be done with overloads? |
As you mentioned this doesn't change the typing too much as is. If we were to change the distributions to functions, I think this could be handled with an overload. One with a explicit name argument, and one without. Most of the hackiness of this comes from trying to maintain full backwards-compatibility. If we were ok with an API break, this just gets typed with |
name = args[0] | ||
args = args[1:] | ||
else: | ||
name = _extract_target_of_assignment(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For situations where automated name determination fails, this part should raise an informative error.
For example, one might simply forget to pass the name:
with pmodel:
pm.Normal("Y", pm.Uniform(), observed=2)
There are exceptions a few lines downstream, but here we should do a try/except
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or even better: make _extract_target_of_assignment
raise the informative error and unit test that directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. What do you mean by an informative error message? I'm not sure what more can I say beyond that I failed to infer the name, and in the _extract_target_of_assignment
that no assignment instruction was found at that frame depth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something about "inferring the name from the outer code"
if you have a handle on the line of code print that
definitely instructions what do do about it: please pass a name as the first argument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed. Does that look ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the message looks good, but I'd recommend to raise it in the extraction function instead of returning None
. It gives a cleaner signature (str or raise) and separation of responsibility..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the error message changes. Are we ok with an Assignment not found error that is raised, caught, and re-raised as a Name needs to be provided error message?
4b592b1
to
38ef6cc
Compare
Very cool. We have been wanting to infer names since the days of PyMC2. |
68e3796
to
97b1f58
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zaxtax something like this
or maybe keep the try except outside, but raise instead of returning none
if "name" in kwargs: | ||
name = kwargs.pop("name") | ||
elif len(args) > 0 and isinstance(args[0], string_types): | ||
name = args[0] | ||
args = args[1:] | ||
else: | ||
name = _extract_target_of_assignment(2) | ||
if name is None: | ||
raise TypeError( | ||
"Name could not be inferred for variable from surrounding " | ||
"context. Pass a name explicitly as the first argument to " | ||
"the Distribution." | ||
) | ||
|
||
if not isinstance(name, string_types): | ||
raise TypeError(f"Name needs to be a string but got: {name}") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if "name" in kwargs: | |
name = kwargs.pop("name") | |
elif len(args) > 0 and isinstance(args[0], string_types): | |
name = args[0] | |
args = args[1:] | |
else: | |
name = _extract_target_of_assignment(2) | |
if name is None: | |
raise TypeError( | |
"Name could not be inferred for variable from surrounding " | |
"context. Pass a name explicitly as the first argument to " | |
"the Distribution." | |
) | |
if not isinstance(name, string_types): | |
raise TypeError(f"Name needs to be a string but got: {name}") | |
name = kwargs.pop("name", None) | |
if name is None: | |
if args and isinstance(args[0], string_types): | |
# Name was provided as the first argument | |
name = args[0] | |
args = args[1:] | |
else: | |
# Try to infer name from the outer context. | |
# This may raise an error, but then there's nothing else we can do. | |
name = _extract_target_of_assignment(2) | |
if not isinstance(name, string_types): | |
raise TypeError(f"Name needs to be a string but got: {name}") | |
# Helper function from pyprob | ||
def _extract_target_of_assignment(depth): | ||
frame = sys._getframe(depth) | ||
code = frame.f_code | ||
next_instruction = code.co_code[frame.f_lasti + 2] | ||
instruction_arg = code.co_code[frame.f_lasti + 3] | ||
instruction_name = opcode.opname[next_instruction] | ||
if instruction_name == "STORE_FAST": | ||
return code.co_varnames[instruction_arg] | ||
elif instruction_name in ["STORE_NAME", "STORE_GLOBAL"]: | ||
return code.co_names[instruction_arg] | ||
elif ( | ||
instruction_name in ["LOAD_FAST", "LOAD_NAME", "LOAD_GLOBAL"] | ||
and opcode.opname[code.co_code[frame.f_lasti + 4]] in ["LOAD_CONST", "LOAD_FAST"] | ||
and opcode.opname[code.co_code[frame.f_lasti + 6]] == "STORE_SUBSCR" | ||
): | ||
if instruction_name == "LOAD_FAST": | ||
base_name = code.co_varnames[instruction_arg] | ||
else: | ||
base_name = code.co_names[instruction_arg] | ||
|
||
second_instruction = opcode.opname[code.co_code[frame.f_lasti + 4]] | ||
second_arg = code.co_code[frame.f_lasti + 5] | ||
if second_instruction == "LOAD_CONST": | ||
value = code.co_consts[second_arg] | ||
elif second_instruction == "LOAD_FAST": | ||
var_name = code.co_varnames[second_arg] | ||
value = frame.f_locals[var_name] | ||
else: | ||
value = None | ||
if value is not None: | ||
index_name = repr(value) | ||
return base_name + "[" + index_name + "]" | ||
else: | ||
return None | ||
else: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Helper function from pyprob | |
def _extract_target_of_assignment(depth): | |
frame = sys._getframe(depth) | |
code = frame.f_code | |
next_instruction = code.co_code[frame.f_lasti + 2] | |
instruction_arg = code.co_code[frame.f_lasti + 3] | |
instruction_name = opcode.opname[next_instruction] | |
if instruction_name == "STORE_FAST": | |
return code.co_varnames[instruction_arg] | |
elif instruction_name in ["STORE_NAME", "STORE_GLOBAL"]: | |
return code.co_names[instruction_arg] | |
elif ( | |
instruction_name in ["LOAD_FAST", "LOAD_NAME", "LOAD_GLOBAL"] | |
and opcode.opname[code.co_code[frame.f_lasti + 4]] in ["LOAD_CONST", "LOAD_FAST"] | |
and opcode.opname[code.co_code[frame.f_lasti + 6]] == "STORE_SUBSCR" | |
): | |
if instruction_name == "LOAD_FAST": | |
base_name = code.co_varnames[instruction_arg] | |
else: | |
base_name = code.co_names[instruction_arg] | |
second_instruction = opcode.opname[code.co_code[frame.f_lasti + 4]] | |
second_arg = code.co_code[frame.f_lasti + 5] | |
if second_instruction == "LOAD_CONST": | |
value = code.co_consts[second_arg] | |
elif second_instruction == "LOAD_FAST": | |
var_name = code.co_varnames[second_arg] | |
value = frame.f_locals[var_name] | |
else: | |
value = None | |
if value is not None: | |
index_name = repr(value) | |
return base_name + "[" + index_name + "]" | |
else: | |
return None | |
else: | |
return None | |
def _extract_target_of_assignment(depth) -> str: | |
"""Helper function to infer RV names from outer code. | |
Adapted from pyprob. | |
""" | |
try: | |
frame = sys._getframe(depth) | |
code = frame.f_code | |
next_instruction = code.co_code[frame.f_lasti + 2] | |
instruction_arg = code.co_code[frame.f_lasti + 3] | |
instruction_name = opcode.opname[next_instruction] | |
if instruction_name == "STORE_FAST": | |
return code.co_varnames[instruction_arg] | |
elif instruction_name in ["STORE_NAME", "STORE_GLOBAL"]: | |
return code.co_names[instruction_arg] | |
elif ( | |
instruction_name in ["LOAD_FAST", "LOAD_NAME", "LOAD_GLOBAL"] | |
and opcode.opname[code.co_code[frame.f_lasti + 4]] in ["LOAD_CONST", "LOAD_FAST"] | |
and opcode.opname[code.co_code[frame.f_lasti + 6]] == "STORE_SUBSCR" | |
): | |
if instruction_name == "LOAD_FAST": | |
base_name = code.co_varnames[instruction_arg] | |
else: | |
base_name = code.co_names[instruction_arg] | |
second_instruction = opcode.opname[code.co_code[frame.f_lasti + 4]] | |
second_arg = code.co_code[frame.f_lasti + 5] | |
if second_instruction == "LOAD_CONST": | |
value = code.co_consts[second_arg] | |
elif second_instruction == "LOAD_FAST": | |
var_name = code.co_varnames[second_arg] | |
value = frame.f_locals[var_name] | |
else: | |
value = None | |
if value is not None: | |
index_name = repr(value) | |
return base_name + "[" + index_name + "]" | |
else: | |
raise Exception() | |
else: | |
raise Exception() | |
except Exception as ex: | |
raise TypeError( | |
"Name could not be inferred for variable from surrounding " | |
"context. Pass a name explicitly as the first argument to " | |
"the Distribution." | |
) from ex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks that clarifies things!
This is a fun and interesting idea, and I really enjoy the inventiveness of the implementation! However, I'm not sure I really would want this to be in pymc. The main assumption of this is that people want to use the same name for a variable binding as for the (global) name of the variable in the trace. I just went through a couple of my models, and at least for me that just isn't the case. Less than half of the variables had the same name as the binding, especially for larger models, where variable names tend to get longer because they have to stay globally unique, and part of the model creation is in separate functions with local namespaces. Also from a more abstract perspective this seems a bit unintuitive: the global name of a variable and the name of the variable binding are quite different things, and if we did this I think pymc would be the only library I know that uses binding names internally. It also goes very much against general rules in python, where the code where a function is called doesn't matter inside the call. It also adds a lot of strangeness in the argument order of the RV constructors, where we need quite a bit of logic to figure out if there is a name or not. This seems to me to trade the small inconvenience of repeating a name from time to time for unique strangeness that goes against how python usually works at its core. That doesn't sound like a good tradeoff to me... If we want to improve variable naming, I think it would be more interesting to think about namespaces. I've been waiting for xarray to eventually support nesting, if that were to happen I think that would help greatly |
How does this fare with Distribution constructor helpers such as ZeroInflatedPoisson, OrderedLogistic, LKJCholeskyCov, which are not distributions themselves? Users should expect the same behavior but I don't want to make these cumbersome to write for developers. |
This won't prevent users from using names, so I don't see how this is an argument against. Of course users didn't bother to give unique assignment names in the past, it didn't matter. Obviously that changes if you are using assignment to provide the name. If you repeat it you'll get the usual repeated name error from the Model. What do you mean with local namespaces? |
This won't prevent users from using names, so I don't see how this is an argument against. Of course users didn't bother to give unique assignment names in the past, it didn't matter. Obviously that changes if you are using assignment to provide the name. If you repeat it you'll get the usual repeated name error from the Model.
True, it still is possible to specify the name. But this seems like a
pretty big change to me, after all with this we are effectively changing
how a call in python works. This might be a good idea if that is almost
always what we want, but if it isn't, do we really want to make such a deep
and unexpected change? Just imaging the confusion of someone familiar with
python but not pymc who is reading a model. Somehow there are names in the
trace, but those names were never actually passed into any function
anywhere! Or someone might rename a local variable (which never changes
behavior in any python code I've seen so far), and then some distant code
that analyses the trace breaks.
What do you mean with local namespaces?
I mean things like
```
def make_spacial_compontent():
mu = pm.Normal("spacial_mu")
sigma = pm.HalfNormal("spacial_sigma")
...
```
So the global variable names are more complicated, but because we only ever
see part of that in the function we use other variable names there.
|
If you pass a name to the model you get it as the prefix of every variable, so you could use a nested model for that already with this PR Edit: Still I don't see the issue as long as you can pass names explicitly. |
@aseyboldt I see your points, but I don't think you are the target audience for this feature. For me, this is really focused at beginners, and to them it's often not clear why they have to repeat things. Further optimizing the simplicity and readability of the HelloWorld model is pretty important I think. Intermediate and advanced users will probably keep using names. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this fare with Distribution constructor helpers such as ZeroInflatedPoisson, OrderedLogistic, LKJCholeskyCov, which are not distributions themselves?
Users should expect the same behavior but I don't want to make these cumbersome to write for developers.
As expected... these type of classes would have to be tweaked in this PR.
import pymc as pm
with pm.Model() as m:
x = pm.ZeroInflatedPoisson(psi=0.5, mu=9)
# TypeError: ZeroInflatedPoisson.__new__() missing 1 required positional argument: 'name'
Thanks @aseyboldt for your thoughtful feedback. I think these are valid points and I'm not even sure beginners mind having to label their random variables. As I mentioned to Michael, much of the hackiness of this originates from trying to maintain full backwards-compatibility. Right now we have lots of user code that treats the name as the first argument. The metaprogramming bits though while very odd are no more odd than what happens with There are also larger design questions that maybe could get us to the same destination but using a more elegant route. I'm not sure we need names for the random variables. They seem to me to be mostly needed for visualisation and diagnostic plots. |
We need names to link variables to sampling results. Otherwise you don't know what is what in the results of prior predictive and sample. Posterior predictive, likewise, relies on names to know what's provided and what needs to be resampled. Storing and loading results with unnamed TensorVariables as the keys would be a mess. |
I think there is a small mess that comes from needing to hold onto the TensorVariables to use as keys. The big mess comes when you need to give them names to be plotted. This comes up in other PPLs like BeanMachine where random variables are decorated functions. The name became the function's name concatenated with its arguments. Which leads to plots that usually don't look the cleanest. |
Will this work for unicode variable names? People have a habit of using unicode mu, sigma, beta, etc. If yes, it's worth adding some tests for that. PS. I think this is great. Think it will allow for much easier on-ramping of beginners. |
Seems to work with unicode characters |
As Python has full unicode support I would expect it to just work, I don't think we need a test for that. |
@@ -164,6 +165,45 @@ def fn(*args, **kwargs): | |||
return fn | |||
|
|||
|
|||
# Helper function from pyprob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
URL to original commit on pyprob: pyprob/pyprob@6aac747
Manual naming would have to be retained in order to support things like re-usable model components. For example, def component_submodel(name, ...):
# Aging curve
c_age = pm.HalfNormal(f"c_age_{name}", 10)
offset_age = pm.HalfNormal(f"offset_age_{name}", 1)
... where the model instantiates several |
Well, it seems like I'm the only one to who this looks like too much magic, if that's the case then I guess it's fine. :-) I guess the reason why this feels like too much magic to me is that this API is not really part of the python language: If you only use API that is not CPython specific, you can not (as far as I know) implement this API at all. In order to implement it, we have to use internal data of CPython that isn't really meant to be used when programming in the language, and that other Python implementations (or future versions of CPython) might not even have. To quote from the CPython docs about some of those functions: https://docs.python.org/3/library/sys.html#sys._getframe
https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy (for the frame attributes)
https://docs.python.org/3/library/dis.html?highlight=dis#module-dis
At least bytecode also has changed quite a bit over the years as far as I am aware. This is in contrast to something like the model context, that's just a glorified global variable, and only uses quite simple features of the python language. I think it is fair to say that an API that can not be implemented in the language is a whole different level of magic. :-) Given that I don't think it is actually that helpful or at least important, (and @twiecki I think breaking rules of the language is actually worse for beginners than for people who have worked with that for a long time), it just doesn't feel right to me. This also raises some compatibility questions that I think we really should investigate at least a bit before we commit to anything like this: It is quite possible that this can't actually be implemented at all in other python implementations, including future CPython versions. I had a quick look and maybe pypy would work, but I'm not sure. We might also want to look at pyston, because I think there is some talk of using similar jit ideas in future CPython versions. I don't think it is particularly likely, but it really would be unfortunate if we go for an API like this, and a few CPython versions later this suddenly can't be done anymore, or it can only be done in a different subset of cases. |
Well, damn, you raise some good points @aseyboldt, need to let that sink in a bit more. One nit: "I think it is fair to say that an API that can not be implemented in the language is a whole different level of magic. :-)" I wouldn't say it's the API per se, because it's still just a function call, but rather the code implementing the API that's not Python. |
Adrian isn't the only one with concerns. If we do this, would we update the documentation to use this feature? All of it? Where would we draw the line? It's clearly violating the "explicit is better than implicit" rule, and because all of it is happening behind the scenes, it may create more confusion than benefits. After all, the usual API of passing names may be verbose, but it's definitely easy to understand. In particular for examples that opt for informative RV names and short variable names ( In the end I think its about trading off slope in the very early vs. medium stage learning curve. (Seemingly easier Quickstart examples at the cost of confusion about moderately complex models.) |
It's also true that after the first slightly negative reaction, people get
over it pretty quickly.
…On Mon, Dec 5, 2022, 23:14 Michael Osthege ***@***.***> wrote:
Adrian isn't the only one with concerns.
*If* we do this, would we update the documentation to use this feature?
All of it? *Where would we draw the line?*
It's clearly violating the "explicit is better than implicit" rule, and
because all of it is happening behind the scenes, it may create more
confusion than benefits. After all, the usual API of passing names may be
verbose, but it's definitely easy to understand. In particular for examples
that opt for informative RV names and short variable names (x =
pm.Data("time", [1,2,3])).
In the end I think its about trading off slope in the very early vs.
medium stage learning curve. (Seemingly easier Quickstart examples at the
cost of confusion about moderately complex models.)
—
Reply to this email directly, view it on GitHub
<#6364 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAFETGGDLGN3WOTMAZSCBUDWLZSMJANCNFSM6AAAAAASSRXDM4>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@aseyboldt makes some very good points. While I think having the option would be great, it sounds like it's not really an option. Maybe we can better sell why the current apparently verbose approach is beneficial. |
I don't think there is any benefit to the current approach, it's just the best we could do within Python's limitations (other than going the JAGS way, and defining models as large strings :D). You have to be verbose. Making that optional would be great. I agree that this is very inelegant, because we're trying to break out of Python's language. Still I think it's fine to exploit features only available in the CPython implementation. It's not like we are testing or explicitly supporting any other Python implementations. The strongest argument against this approach is that the byte-code can be changed at any time rendering this approach impossible / cumbersome in the future. It would suck to have to backtrack. |
I don't have the usual concerns with "too much magic" or being implicit rather than explicit in this case, because ultimately we are talking about naming variables here. If it saves users time and/or makes the model code more concise, then there is not really any downside to me, as long as the approach is robust and allows for corner cases to be accommodated. Moreover, as long as we retain the option to manually override with the |
In terms of the bytecode, while it can and does change a little every version release, this code seems to work across many version releases even going as far back as python 2.x |
That makes me feel better. |
Could you also add a test where the model has a name? To me, it looks like the variable autoname will still have the model name prepended to it, but I think that we should explicitly test that it does. |
Closing this as the reception was pretty mixed. Thanks anyway @zaxtax! |
Reopened, the floor is yours again |
I went through the thread and I also feel that this is too much magic for beginners. I agree that this magic helper reduces code for just a little bit, but it creates more trouble than good.
Sudden refactorings and iterations over model create issues back and forth, you refactor little scope code and it affects late visualizations. The modelling is not 1-shot, a thoughtful user iterates over the model and the easier it is the better. The verbose approach saves from thinking twice about naming and also saves from opportunistic coding but unexpected errors |
I should stress that the magic used here is not unheard of for even fairly popular Python packages https://lukeplant.me.uk/blog/posts/pythons-disappointing-superpowers/ What's the risk with a future version of python? I don't think this PR is ready to merge if it can't be used in a uniform way across the codebase. How common is this short name vs long name pattern in code? It feels if anything this feature encourages users to us descriptive variable names. In either case, I appreciate the concerns you have shared |
I wonder if its any more magical that what we currently do with the abuse of the Python context manager for adding variables to the model? This seems to be along the same lines, in terms of non-explicit behavior, except that we are giving variables names at the same time that we are having a graph built for us. I'm not yet sold myself, but it has the potential to be a major convenience with relatively little risk. |
This PR introduces a way to use distributions without needing to provide a name.
This means:
is equivalent to
I use some code originally written by @Tobias-Kohn for pyprob. The implementation is done by inspecting stack frames to figure out what the assignment will be. It picks earliest point an assignment can take place.
This PR is intended to be fully backwards-compatible.
Checklist